Finding Every Medical Terms by Life Science Dictionary for MedNLP

Authors

  • Shuji Kaneko
  • Nobuyuki Fujita
  • Hiroshi Ohtake
Abstract

We have been developing an English-Japanese thesaurus of medical terms for 20 years. The thesaurus is compatible with MeSH (Medical Subject Headings, developed by the National Library of Medicine, USA) and contains approximately 30,000 headings with 200,000 synonyms (covering the names of anatomical concepts, biological organisms, chemical compounds, methods, diseases, and symptoms). In this study, we aimed to extract as many medical terms as possible from the test data using a simple longest-matching Perl script. After converting the given UTF-8 text to EUC encoding, the matching process took only 2 minutes on a desktop computer (Apple Mac Pro), including loading a 10 MB dictionary into memory. From the 0.1 MB test document, 2,569 terms (including English spellings) were tagged and visualized in color-coded HTML. Focusing on the names of diseases and symptoms, 893 terms were found, with several errors and omissions. However, this process is limited in its handling of ambiguous abbreviations and misspelled words. The simple longest-matching strategy may be useful as a preprocessing step for medical reports.
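The longest-matching idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' Perl script: it is a Python illustration written under assumptions, and the function names (`build_index`, `tag_longest_match`), the toy dictionary, and the sample sentence are all hypothetical. The scan is character-based with no word-boundary check, which is one plausible way to handle unsegmented Japanese text but also means it can match inside longer words.

```python
def build_index(terms):
    """Copy the term -> category mapping and record the longest term length."""
    index = dict(terms)
    max_len = max((len(term) for term in index), default=0)
    return index, max_len


def tag_longest_match(text, index, max_len):
    """Scan left to right; at each position keep the longest dictionary term."""
    matches = []
    i = 0
    while i < len(text):
        found = None
        # Try the longest possible candidate first, then shrink.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in index:
                found = (i, i + length, candidate, index[candidate])
                break
        if found:
            matches.append(found)
            i = found[1]  # resume after the matched term
        else:
            i += 1  # no term starts here; advance one character
    return matches


# Hypothetical toy dictionary; the real thesaurus is far larger.
toy_terms = {
    "diabetes": "disease",
    "diabetes mellitus": "disease",
    "fever": "symptom",
}
index, max_len = build_index(toy_terms)
matches = tag_longest_match("Patient has diabetes mellitus and fever.", index, max_len)
for start, end, term, category in matches:
    print(start, end, term, category)
```

Because the longer entry is tried first, "diabetes mellitus" is tagged as a single term rather than as "diabetes" alone. Exact matching of this kind cannot recover misspelled words or disambiguate abbreviations, consistent with the limitation noted in the abstract.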


Similar resources

Finding Specific Medical Terms Using the Life Science Dictionary for MedNLP

We have been developing an English-Japanese thesaurus of medical terms for the past 20 years. The thesaurus is compatible with MeSH (Medical Subject Headings, developed by the National Library of Medicine, USA) and contains approximately 30,000 headings with 200,000 synonyms (consisting of the names of anatomical concepts, biological organisms, chemical compounds, methods, diseases and symptoms...


NECLA at the Medical Natural Language Processing Pilot Task (MedNLP)

This paper gives an overview of NECLA’s submitted systems for the De-Identification and Complaint & Diagnosis subtasks of the Medical Natural Language Processing Pilot Task (MedNLP)[5]. Our systems combine features derived from Part of Speech (POS) tags, a domain-specific dictionary, the Unified Medical Language System (UMLS) metathesaurus and semantic network, and a small set of heuristics bas...


An Trial Report to NTCIR10 MedNLP: Extracting Medical Diagnostic Term by Machine Learning

This paper explains our approach toward NTCIR10-MEDNLP[1] tasks and what kind of problem we have encountered. We have select term extraction tasks since we have some experience about keyword extraction[2]. Since it is hard to build accurate dictionary or lexicon for medical term, we aimed to use machine learning and large amount of roughly tagged medical corpus as learning data. However, we are...


kyoto: Kyoto University Baseline at the NTCIR-11 MedNLP-2 Task

Since more electronic records are now used at medical scenes, the importance of technical development for analyzing such electronically provided information has been increasing significantly. This NTCIR-11 MedNLP-2 Task is designed to meet this situation. This task is a shared task that evaluates natural language processing technologies especially on Japanese medical texts. The task has three s...


Study and Recognition of Muslim Sage Abdullah Azdi and His Medical Dictionary Called “Kitāb Al-ma”

This study seeks to identify one of the pioneers of traditional clinical medicine named Abdullah Azdi and his medical dictionary. This research is an analytical study. The focus of the search was on two keywords, Abdullah Azdi and Kitab al-Ma'ma, but the scope of the search included all appropriate terms such as: medicine, Bu Ali Sina, traditional medicine, medical dictionary, ethics, and medic...



Publication date: 2013